CU Amiga Super CD-ROM 21

Text File | 1998-02-01 | 3.4 KB | 83 lines

Short:    ARexx script to collect HTML documents. v0.6
Author:   Arne Seime <aseime@iname.com>
Uploader: Arne Seime <aseime@iname.com>
Type:     comm/www
Requires: HTTPJ (comm/www/HTTPJ200.lha), rexxsupport.library
Version:  0.6
Replaces: AutoPage0.4.lha

The script is freeware, but feel free to send me an email if you use it or
have bug reports/suggestions.

Idea:
Cut the phone bill.

Result:
A simple and probably buggy ARexx script to collect HTML pages. Works for me.

You probably have a lot of HTML pages you check every time you are online to
see if there have been any changes. I do. The script uses HTTPJ to check for
updates/changes, and if any are found, it gets the page. Results are
presented in an HTML page.

Installation:
Get hold of HTTPJ and place the executable in the same directory as
AutoPage.rexx, httpj.rexx and sitelist.txt. rexxsupport.library (in
lowercase) should be placed in sys:libs. Autopage.prefs should be in ENV:
(and ENVARC:, of course).

Configuration:
From this version on, a prefs file is included. It should look like this:
[Chopped right from the script]

Say "<savedir> /* Directory to save pages in*/"
Say "<progdir> /* Directory where program files are located */"
Say "<connections> /* Number of connections to run at the same time */"
Say "<loop delay> /* Time in 1/50 sec. Time to wait for ready connection */"
Say "<buffers> /* Download buffer in kb each connection */"

Add or remove sites in sitelist.txt with an editor. Each line should have
this format:

URL IMAGES SHOW

URL:    The address. Don't forget to remove the protocol ("http://").
IMAGES: Get images as well as the HTML page. 0 means no, 1 means yes.
SHOW:   Present the result. Useful to turn off when the page is part of a
        frameset, because HTTPJ doesn't seem to handle frames at all.
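
To make the sitelist.txt format concrete, here is a minimal sketch in Python (not ARexx, and not part of AutoPage itself) of how one "URL IMAGES SHOW" line could be read. The names SiteEntry, parse_sitelist_line and parse_sitelist are made up for this illustration. Skipping blank lines and stripping a leftover "http://" are assumptions of the sketch; the real script may behave differently.

```python
from typing import List, NamedTuple


class SiteEntry(NamedTuple):
    """One line of sitelist.txt (hypothetical name, for illustration only)."""
    url: str      # address without the "http://" prefix
    images: bool  # IMAGES flag: fetch images as well as the page
    show: bool    # SHOW flag: list the page on the result page


def parse_sitelist_line(line: str) -> SiteEntry:
    """Parse a single 'URL IMAGES SHOW' line."""
    fields = line.split()
    if len(fields) != 3:
        raise ValueError(f"expected 'URL IMAGES SHOW', got: {line!r}")
    url, images, show = fields
    if images not in ("0", "1") or show not in ("0", "1"):
        raise ValueError(f"IMAGES and SHOW must be 0 or 1: {line!r}")
    # Assumption: tolerate a leftover protocol prefix instead of failing.
    # The readme asks the user to remove it by hand.
    if url.lower().startswith("http://"):
        url = url[len("http://"):]
    return SiteEntry(url, images == "1", show == "1")


def parse_sitelist(text: str) -> List[SiteEntry]:
    """Parse a whole sitelist; blank lines are skipped (an assumption)."""
    return [parse_sitelist_line(l) for l in text.splitlines() if l.strip()]


if __name__ == "__main__":
    entry = parse_sitelist_line("www.omnipresence.com/ibrowse/index.html 1 1")
    print(entry.url, entry.images, entry.show)
```

A line with a SHOW value of 0 yields an entry whose show field is False, which is the case for frameset parts that should be fetched but not listed.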

Example:
The entries for the IBrowse support page would look like this:

www.omnipresence.com/ibrowse/index.html 1 1
www.omnipresence.com/ibrowse/menue_f.html 1 0
www.omnipresence.com/ibrowse/home_f.html 1 0

This will get the whole thing, but not bother you with two extra items on
the result HTML page.

Also be aware that some servers always present their pages as "new", so
HTTPJ gets them even if they really are the same as the last time you
checked.

Another problem I've come across is that on certain pages HTTPJ gets images
even when I tell it not to. I think this is a bug in HTTPJ, and I've tried
to get in touch with the author, Piergiorgio Ghezzo, but without success. If
anyone knows how to reach him via email, please mail me the address.

Future:
- Make it work from IBrowse.
- Add direct link to the original page.
- Probably a lot of bug fixes. [Done. Well, at least two of them :)]
- Use browser hotlist as sitelist.
- Get more than one page at a time. [Done]

History:
Version 0.6, second release. [Chopped from the script]

** CHANGES SINCE 0.4:
** - Added progdir so the script can be run from any path, not just the
**   current dir.
** - Fixed a time conversion bug that occurred if a disk file was dated
**   xx:00:xx. Strange I didn't discover it before :)
** - The result page is temporarily stored in RAM to keep disk
**   fragmentation down.
** - sitelist.txt was not closed when exiting.
** - Added the possibility to receive several pages at a time.
** - Added a prefs file - a few more options.

Disclaimer:
My bad "Sunday-after-a-real-tough-Saturday-Night-English".